Kludex
Member

@Kludex Kludex commented Aug 25, 2025

github-actions bot commented Aug 25, 2025

Docs Preview

commit: f600e08
Preview URL: https://f7a70c72-pydantic-ai-previews.pydantic.workers.dev

@certainly-param
Contributor

Hi! I've been looking into citation support for OpenRouter (related to #3126) and discovered something interesting while investigating the codebase.

The OpenAI Chat Completions API actually has built-in support for citations through the annotations field on ChatCompletionMessage. The structure includes:

ChatCompletionMessage.annotations: Optional[List[Annotation]]
  └─ Annotation
      ├─ type: Literal['url_citation']
      └─ url_citation: AnnotationURLCitation
          ├─ url: str
          ├─ title: str
          ├─ start_index: int
          └─ end_index: int

I noticed that the current implementation in _process_response() (around lines 576-580) processes choice.message.content but doesn't handle choice.message.annotations. As a result, citations from Chat Completions responses are silently dropped.
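As a rough illustration of reading those annotations, here is a minimal sketch. It uses stand-in dataclasses that mirror the structure listed above rather than the real OpenAI SDK types, and `Message` and `extract_url_citations` are hypothetical names, not part of pydantic-ai or the OpenAI client.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins that mirror the annotation structure described above.
# They are illustrative only, not the real SDK classes.
@dataclass
class AnnotationURLCitation:
    url: str
    title: str
    start_index: int
    end_index: int


@dataclass
class Annotation:
    type: str
    url_citation: AnnotationURLCitation


@dataclass
class Message:  # hypothetical stand-in for ChatCompletionMessage
    content: Optional[str]
    annotations: Optional[list[Annotation]] = None


def extract_url_citations(message: Message) -> list[AnnotationURLCitation]:
    """Collect URL citations, treating a missing annotations field as empty."""
    return [
        annotation.url_citation
        for annotation in (message.annotations or [])
        if annotation.type == 'url_citation'
    ]


message = Message(
    content='See the docs for details.',
    annotations=[
        Annotation(
            type='url_citation',
            url_citation=AnnotationURLCitation(
                url='https://example.com', title='Example', start_index=8, end_index=12
            ),
        )
    ],
)
citations = extract_url_citations(message)
```

Because `annotations` is optional, the `or []` guard keeps the helper safe for responses that carry no citations at all.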

Since providers like OpenRouter, Azure, DeepSeek, Grok, Ollama, and others all use OpenAIChatModel under the hood, adding citation support to the Chat Completions path would automatically enable citations for all of them.

The implementation would be similar to what's already done for the Responses API (lines 799-810), though there's one consideration: when split_content_into_text_and_thinking() returns multiple TextPart objects, we'd need to decide how to distribute citations across them. The start_index and end_index fields could help with this, or we might attach all citations to the first part.
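One way the index-based option could look, as a sketch under assumptions: each text part is assumed to know the offset at which its text started in the original content, which the real split may not preserve once thinking tags are removed. `TextSpan`, `UrlCitation`, and `distribute_citations` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class UrlCitation:  # hypothetical, mirrors the url_citation fields
    url: str
    title: str
    start_index: int
    end_index: int


@dataclass
class TextSpan:  # hypothetical stand-in for a text part plus its origin
    text: str
    offset: int  # assumed: where this text began in the original content


def distribute_citations(
    parts: list[TextSpan], citations: list[UrlCitation]
) -> list[list[UrlCitation]]:
    """Return one citation list per part.

    A citation goes to the part whose range contains its start_index;
    if no part contains it, it falls back to the first part.
    """
    buckets: list[list[UrlCitation]] = [[] for _ in parts]
    if not parts:
        return buckets
    for citation in citations:
        index = next(
            (
                i
                for i, part in enumerate(parts)
                if part.offset <= citation.start_index < part.offset + len(part.text)
            ),
            0,  # fallback: attach to the first part
        )
        buckets[index].append(citation)
    return buckets


parts = [TextSpan(text='Hello ', offset=0), TextSpan(text='world', offset=20)]
citations = [
    UrlCitation('https://a.example', 'A', start_index=21, end_index=24),
    UrlCitation('https://b.example', 'B', start_index=100, end_index=105),
]
buckets = distribute_citations(parts, citations)
```

The sketch leaves the indices untouched; re-basing them relative to each part would be a separate decision.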

Would it make sense to include Chat Completions citation support in this PR, or should it be a follow-up? I'm happy to work on the implementation if that would be helpful.

@DouweM
Collaborator

DouweM commented Oct 13, 2025

@certainly-param Thanks for looking into this, good to see that we can support this for OpenAI Chat Completions (and derivative APIs) as well, not just Responses. The biggest challenge in this PR is to land on a generic data representation for citations that includes all the interesting info from OpenAI's, Anthropic's, and Google's citation representations. Google in particular does things quite differently, and we have to think about different types of citations/annotations, e.g. from web search, from file search, possibly provided by a tool call result... If you want to look into that, be my guest, but the OpenAI Chat Completions support is probably the easiest part of the puzzle :)


3 participants